Augmentation Backdoors
Data augmentation is used extensively to improve model generalisation.
However, reliance on external libraries to implement augmentation methods
introduces a vulnerability into the machine learning pipeline. It is well known
that backdoors can be inserted into machine learning models by serving a
modified dataset to train on. Augmentation therefore presents a perfect
opportunity to perform this modification without requiring an initially
backdoored dataset. In this paper we present three backdoor attacks that can be
covertly inserted into data augmentation. Our attacks each insert a backdoor
using a different type of computer vision augmentation transform, covering
simple image transforms, GAN-based augmentation, and composition-based
augmentation. By inserting the backdoor using these augmentation transforms, we
make our backdoors difficult to detect, while still supporting arbitrary
backdoor functionality. We evaluate our attacks on a range of computer vision
benchmarks and demonstrate that an attacker is able to introduce backdoors
through just a malicious augmentation routine.
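To make the mechanism concrete, here is a minimal sketch (our illustration, not the paper's code) of how a seemingly benign augmentation routine could carry a backdoor: with small probability it stamps a trigger patch onto the image and retargets the label. The class name, trigger shape, and poison rate are all hypothetical.

```python
import numpy as np

# Hypothetical sketch: an augmentation transform that behaves like a benign
# random flip but occasionally poisons the sample with a trigger patch.
class BackdooredAugment:
    def __init__(self, target_label=0, poison_rate=0.05, patch_size=3):
        self.target_label = target_label  # class the backdoor maps to
        self.poison_rate = poison_rate    # fraction of samples to poison
        self.patch_size = patch_size      # side length of the trigger patch

    def __call__(self, image, label):
        # image: H x W x C float array with values in [0, 1]
        if np.random.rand() < 0.5:        # the "cover" augmentation: a random flip
            image = np.fliplr(image)
        if np.random.rand() < self.poison_rate:
            image = image.copy()
            image[-self.patch_size:, -self.patch_size:, :] = 1.0  # white corner trigger
            label = self.target_label     # retarget the label to the attack class
        return image, label
```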
To compress or not to compress: Understanding the Interactions between Adversarial Attacks and Neural Network Compression
As deep neural networks (DNNs) become widely used, pruned and quantised models are becoming ubiquitous on edge devices; such compressed DNNs are popular for lowering computational requirements. Meanwhile, recent studies show that adversarial samples can be effective at making DNNs misclassify. We therefore investigate the extent to which adversarial samples are transferable between uncompressed and compressed
DNNs. We find that adversarial samples remain transferable for both pruned and quantised models. For pruning, adversarial samples generated from heavily pruned models remain effective on uncompressed models.
For quantisation, we find that the transferability of adversarial samples is highly sensitive to integer precision.
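As a rough illustration of this kind of transferability measurement (a sketch under our own assumptions, not the paper's experimental code), one can craft FGSM adversarial samples on a compressed source model and count how often they also fool the uncompressed target model:

```python
import torch
import torch.nn.functional as F

# Illustrative sketch: FGSM samples crafted on a pruned/quantised "source"
# model and evaluated on an uncompressed "target" model. Both models are
# assumed to be in eval mode and to share the same input space.
def fgsm(model, x, y, eps=8 / 255):
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Take a single signed-gradient step and clip back to valid pixel range.
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def transfer_rate(source, target, loader, eps=8 / 255):
    fooled, total = 0, 0
    for x, y in loader:
        x_adv = fgsm(source, x, y, eps)          # attack the compressed model
        with torch.no_grad():
            pred = target(x_adv).argmax(dim=1)   # test on the uncompressed model
        fooled += (pred != y).sum().item()
        total += y.numel()
    return fooled / total  # fraction of adversarial samples that transfer
```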
Wide Attention Is The Way Forward For Transformers?
The Transformer is an extremely powerful and prominent deep learning
architecture. In this work, we challenge the commonly held belief in deep
learning that going deeper is better, and show an alternative design approach:
building wider attention Transformers. We demonstrate that wide single
layer Transformer models can compete with or outperform deeper ones in a
variety of Natural Language Processing (NLP) tasks when both are trained from
scratch. The impact of changing the model aspect ratio on Transformers is then
studied systematically. This ratio balances the number of layers and the number
of attention heads per layer while keeping the total number of attention heads
and all other hyperparameters constant. On average, across 4 NLP tasks and 10
attention types, single layer wide models perform 0.3% better than their deep
counterparts. We provide an in-depth evaluation and demonstrate that wide models
require a far smaller memory footprint and can run faster on commodity
hardware; in addition, these wider models are also more interpretable. For
example, a single layer Transformer on the IMDb byte level text classification
task has 3.1x lower inference latency on a CPU than its equally accurate deeper
counterpart, and is half the size. We therefore put forward wider and shallower
models as a viable and desirable alternative for small models on NLP tasks, and
as an important area of research for domains beyond this.
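As a sketch of the aspect-ratio idea (our illustration with assumed hyperparameters, not the authors' code), the following keeps the total attention-head budget fixed while trading depth for width, using PyTorch's standard encoder modules:

```python
import torch.nn as nn

# Illustrative sketch: fix the total head budget at 32 and vary the aspect
# ratio between depth (number of layers) and width (heads per layer), with
# d_model and all other hyperparameters held constant.
def make_encoder(n_layers, heads_per_layer, d_model=512):
    layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=heads_per_layer)
    return nn.TransformerEncoder(layer, num_layers=n_layers)

deep_model = make_encoder(n_layers=8, heads_per_layer=4)   # 8 layers x 4 heads  = 32 heads
wide_model = make_encoder(n_layers=1, heads_per_layer=32)  # 1 layer  x 32 heads = 32 heads
```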